Abstract

We consider the problem of learning the structure of Ising models (pairwise binary
Markov random fields) from i.i.d. samples. While several methods have been proposed to
accomplish this task, their relative merits and limitations remain somewhat obscure. By
analyzing a number of concrete examples, we show that low-complexity algorithms
systematically fail when the Markov random field develops long-range correlations. More
precisely, this phenomenon appears to be related to the Ising model phase transition
(although it does not coincide with it).



1 Introduction and main results 

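Given a graph $G = (V = [p], E)$ and a positive parameter $\theta > 0$, the ferromagnetic Ising model on
$G$ is the pairwise Markov random field

$\mu_{G,\theta}(x) = \frac{1}{Z_{G,\theta}} \prod_{(i,j) \in E} e^{\theta x_i x_j}$    (1)

over binary variables $x = (x_1, x_2, \dots, x_p)$, $x_i \in \{+1, -1\}$. Apart from being one of the most studied models in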
statistical mechanics, the Ising model is a prototypical undirected graphical model, with applications 
in computer vision, clustering and spatial statistics. Its obvious generalization to edge-dependent
parameters $\theta_{ij}$, $(i,j) \in E$, is of interest as well, and will be introduced in Section 1.2.2. (Let us
stress that we follow the statistical mechanics convention of calling (1) an Ising model for any graph
$G$.)

In this paper we study the following structural learning problem: given $n$ i.i.d. samples
$x^{(1)}, x^{(2)}, \dots, x^{(n)}$ with distribution $\mu_{G,\theta}(\,\cdot\,)$, reconstruct the graph $G$. For the sake of simplicity, we
assume that the parameter $\theta$ is known, and that $G$ has no double edges (it is a 'simple' graph).

The graph learning problem is solvable with unbounded sample complexity, and computational re- 
sources fV\. The question we address is: for which classes of graphs and values of the parameter is 
the problem solvable under appropriate complexity constraints? More precisely, given an algorithm
Alg, a graph $G$, a value $\theta$ of the model parameter, and a small $\delta > 0$, the sample complexity is
defined as

$n_{\rm Alg}(G,\theta) = \inf\{ n \in \mathbb{N} : \mathbb{P}_{n,G,\theta}\{ {\rm Alg}(x^{(1)}, \dots, x^{(n)}) = G \} > 1 - \delta \}$ ,    (2)

where $\mathbb{P}_{n,G,\theta}$ denotes probability with respect to $n$ i.i.d. samples with distribution $\mu_{G,\theta}$. Further,
we let $\chi_{\rm Alg}(G,\theta)$ denote the number of operations of the algorithm Alg, when run on $n_{\rm Alg}(G,\theta)$
samples.¹



¹For the algorithms analyzed in this paper, the behavior of $n_{\rm Alg}$ and $\chi_{\rm Alg}$ does not change significantly if we
require only 'approximate' reconstruction (e.g. in graph distance).
The general problem is therefore to characterize the functions $n_{\rm Alg}(G,\theta)$ and $\chi_{\rm Alg}(G,\theta)$, in
particular for an optimal choice of the algorithm. General bounds on $n_{\rm Alg}(G,\theta)$ have been given in
imO, under the assumption of unbounded computational resources. A general charactrization of 
how well low complexity algorithms can perform is therefore lacking. Although we cannot prove
such a general characterization, in this paper we estimate $n_{\rm Alg}$ and $\chi_{\rm Alg}$ for a number of graph
models, as a function of $\theta$, and unveil a fascinating universal pattern: when the model (1) develops long
range correlations, low-complexity algorithms fail. Under the Ising model, the variables $\{x_i\}_{i \in V}$
become strongly correlated for $\theta$ large. For a large class of graphs with degree bounded by $\Delta$, this
phenomenon corresponds to a phase transition beyond some critical value of $\theta$ uniformly bounded in
$p$, with typically $\theta_{\rm crit} < {\rm const.}/\Delta$. In the examples discussed below, the failure of low-complexity
algorithms appears to be related to this phase transition (although it does not coincide with it).



1.1 A toy example: the thresholding algorithm 

In order to illustrate the interplay between graph structure, sample complexity and interaction
strength $\theta$, it is instructive to consider a warmup example. The thresholding algorithm reconstructs
$G$ by thresholding the empirical correlations

$\hat{C}_{ij} = \frac{1}{n} \sum_{\ell=1}^{n} x_i^{(\ell)} x_j^{(\ell)}$   for $i, j \in V$.    (3)

Thresholding( samples $\{x^{(\ell)}\}$, threshold $\tau$ )

1: Compute the empirical correlations $\{\hat{C}_{ij}\}_{(i,j) \in V \times V}$;
2: For each $(i,j) \in V \times V$
3: If $\hat{C}_{ij} > \tau$, set $(i,j) \in E$;
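The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: samples are lists of ±1 values, and the function names `empirical_correlations` and `threshold_graph` are hypothetical, not taken from the paper.

```python
def empirical_correlations(samples):
    """C[i][j] = (1/n) * sum over samples of x_i * x_j, for entries in {-1, +1}."""
    n, p = len(samples), len(samples[0])
    C = [[0.0] * p for _ in range(p)]
    for x in samples:
        for i in range(p):
            for j in range(p):
                C[i][j] += x[i] * x[j] / n
    return C


def threshold_graph(samples, tau):
    """Return the edges (i, j), i < j, whose empirical correlation exceeds tau."""
    C = empirical_correlations(samples)
    p = len(C)
    return {(i, j) for i in range(p) for j in range(i + 1, p) if C[i][j] > tau}
```

The double loop over pairs for each of the n samples makes the O(p^2 n) operation count explicit.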



We will denote this algorithm by Thr($\tau$). Notice that its complexity is dominated by the computation
of the empirical correlations, i.e. $\chi_{{\rm Thr}(\tau)} = O(p^2 n)$. The sample complexity $n_{{\rm Thr}(\tau)}$ can be bounded
for specific classes of graphs as follows (the proofs are straightforward and omitted from this paper).
Theorem 1.1. If $G$ has maximum degree $\Delta > 1$ and if $\theta < {\rm atanh}(1/(2\Delta))$ then there exists
$\tau = \tau(\theta)$ such that

Further, the choice $\tau(\theta) = (\tanh\theta + (1/2\Delta))/2$ achieves this bound.
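As a small numerical illustration of the threshold choice in Theorem 1.1, the sketch below evaluates $\tau(\theta)$ after checking the hypothesis $\theta < {\rm atanh}(1/(2\Delta))$. The function name `thr_threshold` is hypothetical.

```python
import math


def thr_threshold(theta, delta):
    """Threshold tau(theta) = (tanh(theta) + 1/(2*delta)) / 2 from Theorem 1.1.

    Raises ValueError when theta is outside the range covered by the theorem.
    """
    if not theta < math.atanh(1.0 / (2 * delta)):
        raise ValueError("theta must be below atanh(1/(2*delta))")
    return (math.tanh(theta) + 1.0 / (2 * delta)) / 2.0
```

Under the hypothesis of the theorem, the returned value is the midpoint of $\tanh\theta$ and $1/(2\Delta)$, so it always lies strictly between the two.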

Theorem 1.2. There exists a numerical constant $K$ such that the following is true. If $\Delta > 3$ and
$\theta > K/\Delta$, there are graphs of bounded degree $\Delta$ such that for any $\tau$, $n_{{\rm Thr}(\tau)} = \infty$, i.e. the
thresholding algorithm always fails with high probability.

These results confirm the idea that the failure of low-complexity algorithms is related to long-range
correlations in the underlying graphical model. If the graph $G$ is a tree, then correlations between far
apart variables $x_i$, $x_j$ decay exponentially with the distance between vertices $i$, $j$. The same happens
on bounded-degree graphs if $\theta < {\rm const.}/\Delta$. However, for $\theta > {\rm const.}/\Delta$, there exist families of
bounded degree graphs with long-range correlations.
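The exponential decay on trees can be checked directly on a very small example. The sketch below computes a two-point correlation of model (1) by brute-force enumeration of all $2^p$ configurations, so it is usable only for tiny graphs; the function name `exact_correlation` is hypothetical.

```python
import itertools
import math


def exact_correlation(p, edges, theta, i, j):
    """E[x_i x_j] under the ferromagnetic Ising model (1), by full enumeration."""
    num = 0.0
    den = 0.0
    for x in itertools.product((-1, 1), repeat=p):
        # Unnormalized weight: product over edges of exp(theta * x_a * x_b).
        w = math.exp(theta * sum(x[a] * x[b] for a, b in edges))
        num += x[i] * x[j] * w
        den += w
    return num / den
```

On a path graph the correlation between the two endpoints at distance $d$ equals $(\tanh\theta)^d$, which is the simplest instance of the decay described above.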



1.2 More sophisticated algorithms 

In this section we characterize $\chi_{\rm Alg}(G,\theta)$ and $n_{\rm Alg}(G,\theta)$ for more advanced algorithms. We again
obtain very distinct behaviors of these algorithms depending on long range correlations. Due to
space limitations, we focus on two types of algorithms and only outline the proof of our most
challenging result, namely Theorem 1.6.

In the following we denote by $\partial i$ the neighborhood of a node $i \in G$ ($i \notin \partial i$), and assume the degree
to be bounded: $|\partial i| \le \Delta$.



1.2.1 Local Independence Test 

A recurring approach to structural learning consists in exploiting the conditional independence struc- 
ture encoded by the graph |[1] |4] |5] 15) . 
Let us consider, to be definite, the approach of [4], specializing it to the model (1). Fix a vertex $r$,
whose neighborhood we want to reconstruct, and consider the conditional distribution of $x_r$ given its
neighbors²: $\mu_{G,\theta}(x_r | \underline{x}_{\partial r})$. Any change of $x_i$, $i \in \partial r$, produces a change in this distribution which
is bounded away from 0. Let $U$ be a candidate neighborhood, and assume $U \subseteq \partial r$. Then changing
the value of $x_j$, $j \in U$, will produce a noticeable change in the marginal of $X_r$, even if we condition
on the remaining values in $U$ and in any $W$, $|W| \le \Delta$. On the other hand, if $U \not\subseteq \partial r$, then it is
possible to find $W$ (with $|W| \le \Delta$) and a node $i \in U$ such that changing its value after fixing all
other values in $U \cup W$ will produce no noticeable change in the conditional marginal. (Just choose
$i \in U \setminus \partial r$ and $W = \partial r \setminus U$.) This procedure allows us to distinguish subsets of $\partial r$ from other sets
of vertices, thus motivating the following algorithm.



Local Independence Test( samples $\{x^{(\ell)}\}$, thresholds $(\epsilon, \gamma)$ )
1: Select a node $r \in V$;
2: Set as its neighborhood the largest candidate neighbor $U$ of
   size at most $\Delta$ for which the score function Score($U$) $> \epsilon/2$;
3: Repeat for all nodes $r \in V$;



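The score function Score($\,\cdot\,$) depends on $(\{x^{(\ell)}\}, \Delta, \gamma)$ and is defined as follows,

$\min_{W,j} \; \max_{x_i, \underline{x}_W, \underline{x}_U, z_j} \; \big| \hat{\mathbb{P}}_{n,G,\theta}\{X_i = x_i \,|\, \underline{X}_W = \underline{x}_W, \underline{X}_U = \underline{x}_U\}$
$\qquad - \hat{\mathbb{P}}_{n,G,\theta}\{X_i = x_i \,|\, \underline{X}_W = \underline{x}_W, \underline{X}_{U \setminus j} = \underline{x}_{U \setminus j}, X_j = z_j\} \big|$ .    (5)

In the minimum, $|W| \le \Delta$ and $j \in U$. In the maximum, the values must be such that

$\hat{\mathbb{P}}_{n,G,\theta}\{\underline{X}_W = \underline{x}_W, \underline{X}_U = \underline{x}_U\} > \gamma/2$ ,   $\hat{\mathbb{P}}_{n,G,\theta}\{\underline{X}_W = \underline{x}_W, \underline{X}_{U \setminus j} = \underline{x}_{U \setminus j}, X_j = z_j\} > \gamma/2$ .

$\hat{\mathbb{P}}_{n,G,\theta}$ is the empirical distribution calculated from the samples $\{x^{(\ell)}\}$. We denote this algorithm
by Ind($\epsilon, \gamma$). The search over candidate neighbors $U$, the search for minima and maxima in the
computation of Score($U$) and the computation of $\hat{\mathbb{P}}_{n,G,\theta}$ all contribute to $\chi_{\rm Ind}(G,\theta)$.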

Both theorems that follow are consequences of the analysis of [4].

Theorem 1.3. Let $G$ be a graph of bounded degree $\Delta > 1$. For every $\theta$ there exists $(\epsilon, \gamma)$, and a
numerical constant $K$, such that

«ind(.,^) (G, e) < ^ log ^ , xind,.,,, (G, e)<K (2p)2^+i logp . 

More specifically, one can take e — j sinh(20), 7 = e"'''^^ 2" 
This first result implies in particular that $G$ can be reconstructed with polynomial complexity for
any bounded $\Delta$. However, the degree of such polynomial is high and non-uniform in $\Delta$. This
makes the above approach impractical.

A way out was proposed in Q. The idea is to identify a set of 'potential neighbors' of vertex r via 
thresholding: 

$B(r) = \{ i \in V : \hat{C}_{ri} > \kappa/2 \}$ ,    (6)

For each node $r \in V$, we evaluate Score($U$) by restricting the minimum in Eq. (5) over $W \subseteq B(r)$,
and search only over $U \subseteq B(r)$. We call this algorithm IndD($\epsilon, \gamma, \kappa$). The basic intuition here is
that $\hat{C}_{ri}$ decreases rapidly with the graph distance between vertices $r$ and $i$. As mentioned above,
this is true at small $\theta$.

Theorem 1.4. Let $G$ be a graph of bounded degree $\Delta > 1$. Assume that $\theta < K/\Delta$ for some small
enough constant $K$. Then there exists $\epsilon, \gamma, \kappa$ such that

ri|ndD(.,^,«)(G, 6) < 8(^2 + gA) log ^ , XlndD,,,,,., (G, 9) < if 'pA^^^ + K' Ap^ \ogp . 

More specifically, we can take k — tanhf?, € = j sinh(2(?) and 7 = e^'*'^^ 2^^^. 

²If $a$ is a vector and $R$ is a set of indices then we denote by $\underline{a}_R$ the vector formed by the components of $a$
with index in $R$.
1.2.2 Regularized Pseudo-Likelihoods

A different approach to the learning problem consists in maximizing an appropriate empirical
likelihood function [7, 8, 9, 10, 13]. To control the fluctuations caused by the limited number of samples,
and select sparse graphs, a regularization term is often added [7, 8, 9, 10, 11, 12, 13].

As a specific low complexity implementation of this idea, we consider the $\ell_1$-regularized
pseudo-likelihood method of [7]. For each node $r$, the following likelihood function is considered

$L(\underline{\theta}; \{x^{(\ell)}\}) = -\frac{1}{n} \sum_{\ell=1}^{n} \log \mathbb{P}_{\underline{\theta}}(x_r^{(\ell)} \,|\, x_{\backslash r}^{(\ell)})$    (7)

where $x_{\backslash r} = x_{V \setminus r} = \{x_i : i \in V \setminus r\}$ is the vector of all variables except $x_r$ and $\mathbb{P}_{\underline{\theta}}$ is defined
from the following extension of (1),

$\mu_{\underline{\theta}}(x) = \frac{1}{Z_{\underline{\theta}}} \prod_{i,j \in V} e^{\theta_{ij} x_i x_j}$    (8)

where $\underline{\theta} = \{\theta_{ij}\}_{i,j \in V}$ is a vector of real parameters. Model (1) corresponds to $\theta_{ij} = 0$, $\forall (i,j) \notin E$
and $\theta_{ij} = \theta$, $\forall (i,j) \in E$.

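The function $L(\underline{\theta}; \{x^{(\ell)}\})$ depends only on $\underline{\theta}_{r,\cdot} = \{\theta_{rj},\, j \in \partial r\}$ and is used to estimate the
neighborhood of each node by the following algorithm, Rlr($\lambda$),

Regularized Logistic Regression( samples $\{x^{(\ell)}\}$, regularization ($\lambda$) )
1: Select a node $r \in V$;
2: Calculate $\hat{\underline{\theta}}_{r,\cdot} = \arg\min_{\underline{\theta}_{r,\cdot}} \{ L(\underline{\theta}_{r,\cdot}; \{x^{(\ell)}\}) + \lambda \|\underline{\theta}_{r,\cdot}\|_1 \}$;
3: If $\hat{\theta}_{rj} > 0$, set $(r,j) \in E$;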
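A minimal sketch of the three steps above for a single node, in pure Python. It is not the implementation used in the paper: the minimization is done with plain proximal gradient descent (ISTA), and the conditional law is assumed to be $P(x_r | x_{\backslash r}) = 1/(1 + e^{-2 x_r \eta})$ with $\eta = \sum_j \theta_{rj} x_j$, i.e. each pair is counted once in the model. The names `soft_threshold` and `rlr_neighborhood` are hypothetical.

```python
import math


def soft_threshold(v, t):
    """Proximal map of t * |.|: shrink v towards zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def rlr_neighborhood(samples, r, lam, n_iter=500):
    """Estimate the neighborhood of node r by l1-regularized logistic regression.

    Assumes P(x_r | rest) = 1 / (1 + exp(-2 * x_r * eta)), eta = sum_j theta_j * x_j.
    Returns (estimated neighbor set, dict of estimated parameters).
    """
    n, p = len(samples), len(samples[0])
    others = [j for j in range(p) if j != r]
    theta = {j: 0.0 for j in others}
    step = 1.0 / len(others)  # 1/L, since the Hessian norm is at most p - 1
    for _ in range(n_iter):
        grad = {j: 0.0 for j in others}
        for x in samples:
            eta = sum(theta[j] * x[j] for j in others)
            w = 1.0 - math.tanh(x[r] * eta)
            for j in others:
                # Gradient of -(1/n) * log P(x_r | rest) with respect to theta_j.
                grad[j] -= x[r] * x[j] * w / n
        for j in others:
            theta[j] = soft_threshold(theta[j] - step * grad[j], step * lam)
    return {j for j in others if theta[j] > 0.0}, theta
```

Running `rlr_neighborhood` for every `r` and collecting the pairs `(r, j)` gives the full graph estimate. A production version would more likely rely on an existing convex solver; ISTA is used here only to keep the sketch self-contained.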



Our first result shows that Rlr($\lambda$) indeed reconstructs $G$ if $\theta$ is sufficiently small.

Theorem 1.5. There exist numerical constants $K_1$, $K_2$, $K_3$, such that the following is true. Let $G$
be a graph with degree bounded by $\Delta > 3$. If $\theta < K_1/\Delta$, then there exists $\lambda$ such that

n^wwiG,9)<K2 9-'Alog^. (9) 

Further, the above holds with A — 9 A^^/^. 

This theorem is proved by noting that for $\theta < K_1/\Delta$ correlations decay exponentially, which makes
all conditions in Theorem 1 of [7] (denoted there by A1 and A2) hold, and then computing the
probability of success as a function of $n$, while strengthening the error bounds of [7].

In order to prove a converse to the above result, we need to make some assumptions on $\lambda$. Given
$\theta > 0$, we say that $\lambda$ is 'reasonable for that value of $\theta$' if the following conditions hold: (i) Rlr($\lambda$)
is successful with probability larger than 1/2 on any star graph (a graph composed by a vertex $r$
connected to $\Delta$ neighbors, plus isolated vertices); (ii) $\lambda \le \delta(n)$ for some sequence $\delta(n) \downarrow 0$.

Theorem 1.6. There exists a numerical constant $K$ such that the following happens. If $\Delta > 3$,
$\theta > K/\Delta$, then there exist graphs $G$ of degree bounded by $\Delta$ such that for all reasonable $\lambda$,
$n_{{\rm Rlr}(\lambda)}(G) = \infty$, i.e. regularized logistic regression fails with high probability.

The graphs for which regularized logistic regression fails are not contrived examples. Indeed we will
prove that the claim in the last theorem holds with high probability when $G$ is a uniformly random
graph of regular degree $\Delta$.

The proof of Theorem 1.6 is based on showing that an appropriate incoherence condition is necessary
for Rlr to successfully reconstruct $G$. The analogous result was proven in [14] for model selection
using the Lasso. In this paper we show that such a condition is also necessary when the underlying
model is an Ising model. Notice that, given the graph $G$, checking the incoherence condition is
NP-hard for general (non-ferromagnetic) Ising models, and requires significant computational effort
even in the ferromagnetic case. Hence the incoherence condition does not provide, by itself, a clear
picture of which graph structures are difficult to learn. We will instead show how to evaluate it on
specific graph families.

Figure 1: Learning random subgraphs of a 7 x 7 ($p = 49$) two-dimensional grid from $n = 4500$
Ising model samples, using regularized logistic regression. Left: success probability as a function
of the model parameter $\theta$ and of the regularization parameter $\lambda_0$ (darker corresponds to highest
probability). Right: the same data plotted for several choices of $\lambda$ versus $\theta$. The vertical line
corresponds to the model critical temperature. The thick line is an envelope of the curves obtained
for different $\lambda$, and should correspond to optimal regularization.

Under the restriction $\lambda \to 0$ the solutions given by Rlr converge to $\underline{\theta}^*$ with $n$ [7]. Thus, for large
$n$ we can expand $L$ around $\underline{\theta}^*$ to second order in $(\underline{\theta} - \underline{\theta}^*)$. When we add the regularization term
to $L$ we obtain a quadratic model analogous to the Lasso plus the error term due to the quadratic
approximation. It is thus not surprising that, when $\lambda \to 0$, the incoherence condition introduced for
the Lasso in [14] is also relevant for the Ising model.



2 Numerical experiments 

In order to explore the practical relevance of the above results, we carried out extensive numerical
simulations using the regularized logistic regression algorithm Rlr($\lambda$). Among other learning
algorithms, Rlr($\lambda$) strikes a good balance of complexity and performance. Samples from the Ising model
(1) were generated using Gibbs sampling (a.k.a. Glauber dynamics). Mixing time can be very large
for $\theta > \theta_{\rm crit}$, and was estimated using the time required for the overall bias to change sign (this is a
quite conservative estimate at low temperature). Generating the samples $\{x^{(\ell)}\}$ was indeed the bulk
of our computational effort and took about 50 days CPU time on Pentium Dual Core processors (we
show here only part of these data). Notice that Rlr($\lambda$) had been tested in [7] only on tree graphs $G$,
or in the weakly coupled regime $\theta < \theta_{\rm crit}$. In these cases sampling from the Ising model is easy, but
structural learning is also intrinsically easier.
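For concreteness, a Glauber-dynamics sampler for model (1) can be sketched as follows. The burn-in and thinning values are placeholders chosen for illustration, not the mixing-time estimates used for the experiments, and the names `gibbs_sweep` and `sample_ising` are hypothetical. The graph is passed as an adjacency list.

```python
import math
import random


def gibbs_sweep(x, neighbors, theta, rng):
    """One sweep of Glauber dynamics: resample each spin given its neighbors."""
    for i in range(len(x)):
        field = theta * sum(x[j] for j in neighbors[i])
        # P(x_i = +1 | rest) = e^{field} / (e^{field} + e^{-field}).
        p_plus = 0.5 * (1.0 + math.tanh(field))
        x[i] = 1 if rng.random() < p_plus else -1
    return x


def sample_ising(neighbors, theta, n_samples, burn_in=200, thin=20, seed=0):
    """Draw approximate samples from the ferromagnetic Ising model (1)."""
    rng = random.Random(seed)
    x = [rng.choice((-1, 1)) for _ in neighbors]
    for _ in range(burn_in):
        gibbs_sweep(x, neighbors, theta, rng)
    samples = []
    for _ in range(n_samples):
        for _ in range(thin):
            gibbs_sweep(x, neighbors, theta, rng)
        samples.append(list(x))
    return samples
```

At large $\theta$ consecutive sweeps are strongly correlated, so fixed `burn_in` and `thin` values like the defaults here would not be adequate; this is the regime where the text reports very large mixing times.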

Figure 1 reports the success probability of Rlr($\lambda$) when applied to random subgraphs of a 7 x 7
two-dimensional grid. Each such graph was obtained by removing each edge independently with
probability $\rho = 0.3$. Success probability was estimated by applying Rlr($\lambda$) to each vertex of 8
graphs (thus averaging over 392 runs of Rlr($\lambda$)), using $n = 4500$ samples. We scaled the
regularization parameter as $\lambda = 2 \lambda_0 \theta (\log p / n)^{1/2}$ (this choice is motivated by the algorithm analysis and
is empirically the most satisfactory), and searched over $\lambda_0$.
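The scaling of the regularization parameter can be written as a one-line helper. The natural logarithm is assumed here, and the function name `rlr_regularization` is hypothetical.

```python
import math


def rlr_regularization(lambda0, theta, p, n):
    """lambda = 2 * lambda0 * theta * sqrt(log(p) / n), natural logarithm assumed."""
    return 2.0 * lambda0 * theta * math.sqrt(math.log(p) / n)
```

For the grid experiment ($p = 49$, $n = 4500$) the factor $(\log p / n)^{1/2}$ is roughly 0.029, so $\lambda$ is about $0.059\, \lambda_0 \theta$.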

The data clearly illustrate the phenomenon discussed. Despite the large number of samples
$n \gg \log p$, when $\theta$ crosses a threshold, the algorithm starts performing poorly irrespective of $\lambda$.
Intriguingly, this threshold is not far from the critical point of the Ising model on a randomly diluted
grid $\theta_{\rm crit}(\rho = 0.3) \approx 0.7$ [15, 16].
Figure 2: Learning uniformly random graphs of degree 4 from Ising model samples, using Rlr.
Left: success probability as a function of the number of samples $n$ for several values of $\theta$. Right:
the same data plotted for several choices of $\lambda$ versus $\theta$ as in Fig. 1, right panel.



Figure 2 presents similar data when $G$ is a uniformly random graph of degree $\Delta = 4$, over $p = 50$
vertices. The evolution of the success probability with $n$ clearly shows a dichotomy. When $\theta$ is
below a threshold, a small number of samples is sufficient to reconstruct $G$ with high probability.
Above the threshold even n — 10'' samples are to few. In this case we can predict the threshold 
analytically, cf. Lemma 3.3 below, and get $\theta_{\rm thr}(\Delta = 4) \approx 0.4203$, which compares favorably with
the data.



3 Proofs 

In order to prove Theorem 1.6 we need a few auxiliary results. It is convenient to introduce some
notations. If $M$ is a matrix and $R$, $P$ are index sets then $M_{R,P}$ denotes the submatrix with row
indices in $R$ and column indices in $P$. As above, we let $r$ be the vertex whose neighborhood we are
trying to reconstruct and define $S = \partial r$, $S^c = V \setminus \{\partial r \cup r\}$. Since the cost function $L(\underline{\theta}; \{x^{(\ell)}\}) + \lambda \|\underline{\theta}\|_1$
only depends on $\underline{\theta}$ through its components $\underline{\theta}_{r,\cdot} = \{\theta_{rj}\}$, we will hereafter neglect all the other
parameters and write $\underline{\theta}$ as a shorthand of $\underline{\theta}_{r,\cdot}$.

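Let $\hat{z}^*$ be a subgradient of $\|\underline{\theta}\|_1$ evaluated at the true parameters values, $\underline{\theta}^* = \{\theta_{rj} : \theta_{rj} = 0, \forall j \notin
\partial r,\ \theta_{rj} = \theta, \forall j \in \partial r\}$. Let $\hat{\underline{\theta}}^n$ be the parameter estimate returned by Rlr($\lambda$) when the number
of samples is $n$. Note that, since we assumed $\theta > 0$, $\hat{z}^*_S = \mathbb{1}$. Define $Q^n(\underline{\theta}; \{x^{(\ell)}\})$ to be the
Hessian of $L(\underline{\theta}; \{x^{(\ell)}\})$ and $Q(\underline{\theta}) = \lim_{n \to \infty} Q^n(\underline{\theta}; \{x^{(\ell)}\})$. By the law of large numbers $Q(\underline{\theta})$ is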
the Hessian of Ece log PG,e{Xr\X\r) where Eg e is the expectation with respect to (O and X is a 
random variable distributed according to ([SJ. We will denote the maximum and minimum eigenvalue 
of a symmetric matrix $M$ by $\sigma_{\max}(M)$ and $\sigma_{\min}(M)$ respectively.

We will omit arguments whenever clear from the context. Any quantity evaluated at the true
parameter values will be represented with a $*$, e.g. $Q^* = Q(\underline{\theta}^*)$. Quantities under a hat ($\wedge$) depend on $n$.
Throughout this section $G$ is a graph of maximum degree $\Delta$.

3.1 Proof of Theorem 1.6

Our first auxiliary result establishes that, if $\lambda$ is small, then $\|Q^*_{S^c S} {Q^*_{SS}}^{-1} \hat{z}^*_S\|_\infty > 1$ is a sufficient
condition for the failure of Rlr($\lambda$).
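Given a numerical Hessian, the quantity appearing in this condition is straightforward to evaluate. The sketch below takes a symmetric matrix `Q` indexed by the nodes other than $r$ and a list `S` of neighbor indices, and returns $\|Q_{S^c S} Q_{SS}^{-1} \mathbb{1}_S\|_\infty$ (using $\hat{z}^*_S = \mathbb{1}$). The helper names and the example matrix are hypothetical; how `Q` is obtained (exactly or from samples) is left open.

```python
def solve(A, b):
    """Solve A y = b by Gaussian elimination with partial pivoting."""
    m = len(A)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for c in range(m):
        piv = max(range(c, m), key=lambda k: abs(M[k][c]))
        M[c], M[piv] = M[piv], M[c]
        for k in range(c + 1, m):
            f = M[k][c] / M[c][c]
            for col in range(c, m + 1):
                M[k][col] -= f * M[c][col]
    y = [0.0] * m
    for c in reversed(range(m)):
        y[c] = (M[c][m] - sum(M[c][k] * y[k] for k in range(c + 1, m))) / M[c][c]
    return y


def incoherence(Q, S):
    """Largest |[Q_{S^c S} Q_{SS}^{-1} 1_S]_i| over indices i outside S."""
    Sc = [i for i in range(len(Q)) if i not in S]
    Q_SS = [[Q[a][b] for b in S] for a in S]
    y = solve(Q_SS, [1.0] * len(S))
    return max(abs(sum(Q[i][s] * y[k] for k, s in enumerate(S))) for i in Sc)
```

For instance, with `Q = [[1, 0.5, 0.8], [0.5, 1, 0.8], [0.8, 0.8, 1]]` and `S = [0, 1]` the value is 16/15, so the condition above would be met for this made-up matrix.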

Lemma 3.1. Assume $[Q^*_{S^c S} {Q^*_{SS}}^{-1} \hat{z}^*_S]_i \ge 1 + \epsilon$ for some $\epsilon > 0$ and some row $i \in V$, $\sigma_{\min}(Q^*_{SS}) \ge$
Gmin > 0, and A < ^/C^~^J¥A^. Then the success probability of Rlr(A) is upper bounded as 

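$P_{\rm succ} \le 4\Delta^2 e^{-n \delta_A^2} + 2\Delta\, e^{-n \lambda^2 \delta_B^2}$    (10)
where $\delta_A = (C_{\min}^2 / 100\Delta^2)\,\epsilon$ and $\delta_B = (C_{\min} / 8\Delta)\,\epsilon$.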
The next Lemma implies that, for $\lambda$ to be 'reasonable' (in the sense introduced in Section 1.2.2),
$n\lambda^2$ must be unbounded.

Lemma 3.2. There exist $M = M(K,\theta) > 0$ for $\theta > 0$ such that the following is true: If $G$ is the
graph with only one edge between nodes $r$ and $i$ and $n\lambda^2 < K$, then

$P_{\rm succ} \le e^{-M(K,\theta) p} + e^{-n (1 - \tanh\theta)^2 / 32}$ .



Finally, our key result shows that the condition $\|Q^*_{S^c S} {Q^*_{SS}}^{-1} \hat{z}^*_S\|_\infty \le 1$ is violated with high
probability for large random graphs. The proof of this result relies on a local weak convergence
result for ferromagnetic Ising models on random graphs proved in [17].

Lemma 3.3. Let $G$ be a uniformly random regular graph of degree $\Delta > 3$, and $\epsilon > 0$ be sufficiently
small. Then, there exists $\theta_{\rm thr}(\Delta, \epsilon)$ such that, for $\theta > \theta_{\rm thr}(\Delta, \epsilon)$, $\|Q^*_{S^c S} {Q^*_{SS}}^{-1} \hat{z}^*_S\|_\infty \ge 1 + \epsilon$ with
probability converging to 1 as $p \to \infty$.

Furthermore, for large A, 0thr(A,O+) = 0A^^(1 + o(l)). The constant 9 is given by 9 — 
tanh and h is the unique positive solution ofhtanhh = (1 — tanh h)'^. Finally, there exist 
$C_{\min} > 0$ dependent only on $\Delta$ and $\theta$ such that $\sigma_{\min}(Q^*_{SS}) \ge C_{\min}$ with probability converging to
1 as $p \to \infty$.

The proofs of Lemmas 3.1 and 3.3 are sketched in the next subsection. Lemma 3.2 is more
straightforward and we omit its proof for space reasons.

Proof. (Theorem 1.6) Fix $\Delta > 3$, $\theta > K/\Delta$ (where $K$ is a large enough constant independent of
$\Delta$), and $\epsilon, C_{\min} > 0$ both small enough. By Lemma 3.3, for any $p$ large enough we can choose
a $\Delta$-regular graph $G_p = (V = [p], E_p)$ and a vertex $r \in V$ such that $|Q^*_{S^c S} {Q^*_{SS}}^{-1} \hat{z}^*_S|_i > 1 + \epsilon$ for
some $i \in V \setminus r$.

By Theorem 1 in [4] we can assume, without loss of generality, $n > K' \Delta \log p$ for some small
constant $K'$. Further, by Lemma 3.2, $n\lambda^2 \ge F(p)$ for some $F(p) \uparrow \infty$ as $p \to \infty$, and the condition
of Lemma 3.1 on $\lambda$ is satisfied since by the "reasonable" assumption $\lambda \to 0$ with $n$. Using these
results in Eq. (10) of Lemma 3.1 we get the following upper bound on the success probability

$P_{\rm succ}(G_p) \le 4\Delta^2 p^{-\delta_A^2 K' \Delta} + 2\Delta\, e^{-F(p) \delta_B^2}$ .    (12)
In particular $P_{\rm succ}(G_p) \to 0$ as $p \to \infty$. □

3.2 Proofs of auxiliary lemmas 

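Proof. (Lemma 3.1) We will show that under the assumptions of the lemma and if $\hat{\underline{\theta}} = (\hat{\underline{\theta}}_S, \hat{\underline{\theta}}_{S^c}) =
(\hat{\underline{\theta}}_S, 0)$ then the probability that the $i$-th component of any subgradient of $L(\underline{\theta}; \{x^{(\ell)}\}) + \lambda \|\underline{\theta}\|_1$ vanishes
for any $\hat{\underline{\theta}}_S > 0$ (component wise) is upper bounded as in Eq. (10). To simplify notation we will omit
$\{x^{(\ell)}\}$ in all the expressions derived from $L$.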

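Let $\hat{z}$ be a subgradient of $\|\underline{\theta}\|_1$ at $\hat{\underline{\theta}}$ and assume $\nabla L(\hat{\underline{\theta}}) + \lambda \hat{z} = 0$. An application of the mean value
theorem yields

$\nabla^2 L(\underline{\theta}^*) [\hat{\underline{\theta}} - \underline{\theta}^*] = W^n - \lambda \hat{z} - R^n$ ,    (13)

where $W^n = -\nabla L(\underline{\theta}^*)$ and $[R^n]_j = [\nabla^2 L(\bar{\underline{\theta}}^{(j)}) - \nabla^2 L(\underline{\theta}^*)]_j^T (\hat{\underline{\theta}} - \underline{\theta}^*)$ with $\bar{\underline{\theta}}^{(j)}$ a point in the line
from $\hat{\underline{\theta}}$ to $\underline{\theta}^*$. Notice that by definition $\nabla^2 L(\underline{\theta}^*) = Q^{n*} = Q^n(\underline{\theta}^*)$. To simplify notation we will
omit the $*$ in all $Q^{n*}$. All $Q^n$ in this proof are thus evaluated at $\underline{\theta}^*$.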

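Breaking this expression into its $S$ and $S^c$ components and since $\hat{\underline{\theta}}_{S^c} = \underline{\theta}^*_{S^c} = 0$ we can eliminate
$\hat{\underline{\theta}}_S - \underline{\theta}^*_S$ from the two expressions obtained and write

$[W^n_{S^c} - R^n_{S^c}] - Q^n_{S^c S} (Q^n_{SS})^{-1} [W^n_S - R^n_S] + \lambda Q^n_{S^c S} (Q^n_{SS})^{-1} \hat{z}_S = \lambda \hat{z}_{S^c}$ .    (14)
Now notice that $Q^n_{S^c S} (Q^n_{SS})^{-1} = T_1 + T_2 + T_3 + T_4$ where

$T_1 = Q^*_{S^c S} [(Q^n_{SS})^{-1} - (Q^*_{SS})^{-1}]$ ,   $T_2 = [Q^n_{S^c S} - Q^*_{S^c S}] {Q^*_{SS}}^{-1}$ ,

$T_3 = [Q^n_{S^c S} - Q^*_{S^c S}] [(Q^n_{SS})^{-1} - (Q^*_{SS})^{-1}]$ ,   $T_4 = Q^*_{S^c S} {Q^*_{SS}}^{-1}$ .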
We will assume that the samples $\{x^{(\ell)}\}$ are such that the following event holds

$\mathcal{E} \equiv \{ \|Q^n_{SS} - Q^*_{SS}\|_\infty < \xi_A ,\ \|Q^n_{S^c S} - Q^*_{S^c S}\|_\infty < \xi_B ,\ \|W^n_S / \lambda\|_\infty < \xi_C \}$ ,    (15)

where $\xi_A \equiv C_{\min}^2 \epsilon / (16\Delta)$, $\xi_B \equiv C_{\min} \epsilon / (8\sqrt{\Delta})$ and $\xi_C \equiv C_{\min} \epsilon / (8\Delta)$. Since $\mathbb{E}_{G,\theta}(Q^n) = Q^*$
and $\mathbb{E}_{G,\theta}(W^n) = 0$ and noticing that both $Q^n$ and $W^n$ are sums of bounded i.i.d. random variables,
a simple application of Azuma-Hoeffding inequality upper bounds the probability of $\mathcal{E}$ as in (10).

From £ it follows that crmin(Q55) > crnun{Q*ss) ^ C'min/S > C,nin/2. We Can therefore lower 
bound the absolute value of the $i$-th component of $\hat{z}_{S^c}$ by

$|[Q^*_{S^c S} {Q^*_{SS}}^{-1} \mathbb{1}_S]_i| - \|T_{1,i}\|_\infty - \|T_{2,i}\|_\infty - \|T_{3,i}\|_\infty - \cdots$
where the subscript $i$ denotes the $i$-th row of a matrix.

The proof is completed by showing that the event $\mathcal{E}$ and the assumptions of the theorem imply that
each of the last 7 terms in this expression is smaller than $\epsilon/8$. Since $|[Q^*_{S^c S} {Q^*_{SS}}^{-1}]_i^T \hat{z}^*_S| \ge 1 + \epsilon$ by
assumption, this implies $|\hat{z}_i| > 1 + \epsilon/8 > 1$, which cannot be since any subgradient of the 1-norm
has components of magnitude at most 1.

The last condition on $\mathcal{E}$ immediately bounds all terms involving $W$ by $\epsilon/8$. Some straightforward
manipulations imply (see Lemma 7 from [7])
||ri,i||oo < — WQ'SS ~ Qsslloo , ||T'2,i||oo < ll['9sC'5 - Q*sc s]i\\a2 , 

2A „ ^ 

llTa.illoo < -T^ — WQss ~ QsS I loo 1 1 [Qs^S ~ Q*scs\i\\^ > 
mill 

and thus all will be bounded by $\epsilon/8$ when $\mathcal{E}$ holds. The upper bound of $R^n$ follows along similar
lines via a mean value theorem, and is deferred to a longer version of this paper. □

Proof. (Lemma 3.3) Let us state explicitly the local weak convergence result mentioned in Sec. 3.1.
For $t \in \mathbb{N}$, let $T(t) = (V_T, E_T)$ be the regular rooted tree of $t$ generations and define the associated
Ising measure as

$\mu^{+}_{T(t),\theta}(x) = \frac{1}{Z_{T,\theta}} \prod_{(i,j) \in E_T} e^{\theta x_i x_j} \prod_{i \in \partial T(t)} e^{h^* x_i}$ .    (16)

Here $\partial T(t)$ is the set of leaves of $T(t)$ and $h^*$ is the unique positive solution of $h = (\Delta -
1)\, {\rm atanh}\{\tanh\theta \tanh h\}$. It can be proved using [17] and uniform continuity with respect to the
'external field' that non-trivial local expectations with respect to $\mu_{G,\theta}(x)$ converge to local
expectations with respect to $\mu^{+}_{T(t),\theta}(x)$, as $p \to \infty$.
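The boundary field $h^*$ can be found numerically by iterating the fixed-point equation from a large starting value. This is a sketch under the assumption that plain iteration is adequate (it converges to the positive solution when one exists and to 0 otherwise); the function name `boundary_field` is hypothetical.

```python
import math


def boundary_field(theta, delta, iters=10000, tol=1e-12):
    """Iterate h -> (delta - 1) * atanh(tanh(theta) * tanh(h)) starting from a large h."""
    h = 10.0
    for _ in range(iters):
        h_new = (delta - 1) * math.atanh(math.tanh(theta) * math.tanh(h))
        if abs(h_new - h) < tol:
            return h_new
        h = h_new
    return h
```

Linearizing the map at $h = 0$ shows that a positive fixed point requires $(\Delta - 1) \tanh\theta > 1$; below that value the iteration collapses to $h = 0$.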

More precisely, let $B_r(t)$ denote a ball of radius $t$ around node $r \in G$ (the node whose neighborhood
we are trying to reconstruct). For any fixed $t$, the probability that $B_r(t)$ is not isomorphic to $T(t)$
goes to 0 as $p \to \infty$. Let $g(\underline{x}_{B_r(t)})$ be any function of the variables in $B_r(t)$ such that $g(\underline{x}_{B_r(t)}) =
g(-\underline{x}_{B_r(t)})$. Then almost surely over graph sequences $G_p$ of uniformly random regular graphs with
$p$ nodes (expectations here are taken with respect to the measures (1) and (16))

$\lim_{p \to \infty} \mathbb{E}_{G,\theta}\{g(\underline{X}_{B_r(t)})\} = \mathbb{E}_{T(t),\theta,+}\{g(\underline{X}_{T(t)})\}$ .    (17)

The proof consists in considering $[Q^*_{S^c S} {Q^*_{SS}}^{-1} \hat{z}^*_S]_i$ for $t = {\rm dist}(r, i)$ finite. We then write
$(Q^*_{SS})_{lk} = \mathbb{E}\{g_{l,k}(\underline{X}_{B_r(t)})\}$ and $(Q^*_{S^c S})_{il} = \mathbb{E}\{g_{i,l}(\underline{X}_{B_r(t)})\}$ for some functions $g_{\cdot,\cdot}(\underline{X}_{B_r(t)})$ and
apply the weak convergence result (17) to these expectations. We thus reduced the calculation of
$[Q^*_{S^c S} {Q^*_{SS}}^{-1} \hat{z}^*_S]_i$ to the calculation of expectations with respect to the tree measure (16). The latter
can be implemented explicitly through a recursive procedure, with simplifications arising thanks to
the tree symmetry and by taking $t \gg 1$. The actual calculations consist in a (very) long exercise in
calculus and we omit them from this outline.

The lower bound on $\sigma_{\min}(Q^*_{SS})$ is proved by a similar calculation. □